Skip to content

Conversation

@BrianRothermich
Copy link
Contributor

@BrianRothermich BrianRothermich commented May 19, 2025

Relates to an effort to consolidate the stateless merge scheduler with the current (stateful) merge scheduler from main ES. This PR brings over features required to maintain parity with the stateless scheduler. Specifically, a few methods are added for the stateless scheduler to override:

  • Adds an overridable method shouldSkipMerge to test for skipping merges
  • Adds 2 additional lifecycle callbacks to the scheduler for when a merge is enqueued and when a merge is executed or aborted. This is used by stateless to track active + queued merges per-shard
  • Adds overridable methods for enabling/disabling IO/thread/merge count throttling

Other functionality required by the stateless merge scheduler can use the existing callbacks from the stateful scheduler:

  • beforeMerge can be overridden to prewarm
  • afterMerge can be overridden to refresh after big merges

Relates ES-10264

@BrianRothermich BrianRothermich marked this pull request as ready for review May 19, 2025 19:28
@elasticsearchmachine elasticsearchmachine added the needs:triage Requires assignment of a team area label label May 19, 2025
@BrianRothermich BrianRothermich added >non-issue :Distributed Indexing/Distributed A catch all label for anything in the Distributed Indexing Area. Please avoid if you can. and removed needs:triage Requires assignment of a team area label labels May 19, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-distributed-indexing (Team:Distributed Indexing)

@elasticsearchmachine elasticsearchmachine added the Team:Distributed Indexing Meta label for Distributed Indexing team label May 19, 2025
@elasticsearchmachine elasticsearchmachine added the serverless-linked Added by automation, don't add manually label May 22, 2025
Copy link
Contributor

@henningandersen henningandersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added comments for a less intrusive change by instead extending the merge scheduler in stateless, overriding a few specific methods.

ThreadPoolMergeExecutorService threadPoolMergeExecutorService,
MergeMemoryEstimateProvider mergeMemoryEstimateProvider
MergeMemoryEstimateProvider mergeMemoryEstimateProvider,
BooleanSupplier shouldSkipMerge
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should add a method shouldSkipMerge returning false here that we then implement in a stateless specific subclass instead?

}

MergeSchedulerConfig(boolean autoThrottle, int maxThreadCount, int maxMergeCount) {
this.autoThrottle = autoThrottle;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than overriding it like this, could we instead add ThreadPoolMergeScheduler.isAutoThrottle() that returns config.isAutoThrottle() by default but is overloaded in stateless to return false?

Similar for the two other fields.

generationThresholdSize = scopedSettings.get(INDEX_TRANSLOG_GENERATION_THRESHOLD_SIZE_SETTING);
flushAfterMergeThresholdSize = scopedSettings.get(INDEX_FLUSH_AFTER_MERGE_THRESHOLD_SIZE_SETTING);
mergeSchedulerConfig = new MergeSchedulerConfig(this);
if (DiscoveryNode.isStateless(nodeSettings)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to avoid this if we can and added suggestions to use a sub-class for the merge scheduler in stateless.

@henningandersen henningandersen self-requested a review May 22, 2025 18:15
@BrianRothermich
Copy link
Contributor Author

BrianRothermich commented May 22, 2025

I added comments for a less intrusive change by instead extending the merge scheduler in stateless, overriding a few specific methods.

This feels like a cleaner approach - thanks for the suggestion. Updated to use overridable methods

Copy link
Contributor

@henningandersen henningandersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@BrianRothermich BrianRothermich removed the serverless-linked Added by automation, don't add manually label May 22, 2025
@BrianRothermich BrianRothermich merged commit 1a641e5 into elastic:main May 22, 2025
18 checks passed
@BrianRothermich BrianRothermich deleted the brothermich/ES-10264 branch May 22, 2025 20:53
albertzaharovits pushed a commit to albertzaharovits/elasticsearch that referenced this pull request Jun 9, 2025
Relates to an effort to consolidate the stateless merge scheduler with the current (stateful) merge scheduler from main ES. This PR brings over features required to maintain parity with the stateless scheduler. Specifically, a few methods are added for the stateless scheduler to override:

Adds an overridable method shouldSkipMerge to test for skipping merges
Adds 2 additional lifecycle callbacks to the scheduler for when a merge is enqueued and when a merge is executed or aborted. This is used by stateless to track active + queued merges per-shard
Adds overridable methods for enabling/disabling IO/thread/merge count throttling

Other functionality required by the stateless merge scheduler can use the existing callbacks from the stateful scheduler:

beforeMerge can be overridden to prewarm
afterMerge can be overridden to refresh after big merges

Relates ES-10264
---------

Co-authored-by: elasticsearchmachine <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Distributed Indexing/Distributed A catch all label for anything in the Distributed Indexing Area. Please avoid if you can. >non-issue Team:Distributed Indexing Meta label for Distributed Indexing team v9.1.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants